AITopics | representation vector

Trajectory generation has recently drawn growing interest in privacy-preserving urban mobility studies and location-based service applications. Although many studies have used deep learning or generative AI methods to model trajectories and have achieved promising results, the robustness and interpretability of such models are largely unexplored. This limits the application of trajectory generation algorithms on noisy real-world data and their trustworthiness in downstream tasks. To address this issue, we exploit the regular structure in urban trajectories and propose a deep generative model based on the pathlet representation, which encode trajectories with binary vectors associated with a learned dictionary of trajectory segments. Specifically, we introduce a probabilistic graphical model to describe the trajectory generation process, which includes a Variational Autoencoder (VAE) component and a linear decoder component. During training, the model can simultaneously learn the latent embedding of pathlet representations and the pathlet dictionary that captures mobility patterns in the trajectory dataset. The conditional version of our model can also be used to generate customized trajectories based on temporal and spatial constraints. Our model can effectively learn data distribution even using noisy data, achieving relative improvements of $35.4\%$ and $26.3\%$ over strong baselines on two real-world trajectory datasets. Moreover, the generated trajectories can be conveniently utilized for multiple downstream tasks, including trajectory prediction and data denoising. Lastly, the framework design offers a significant efficiency advantage, saving $64.8\%$ of the time and $56.5\%$ of GPU memory compared to previous approaches.

artificial intelligence, machine learning, trajectory, (19 more...)

arXiv.org Artificial Intelligence

2511.16105

Country:

Asia > China (0.30)
North America > United States (0.29)

Genre: Research Report (1.00)

Industry: Transportation (0.97)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.54)

Add feedback

Neural Trees for Learning on Graphs

Neural Information Processing SystemsOct-9-2025, 16:30:17 GMT

In this work, we propose a new GNN architecture - the Neural Tree .

artificial intelligence, graph, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

B.2 Hierarchical k -means for semantic identifier The pseudo code of hierarchical k -means is detailed in in Algorithm 1. Algorithm 1: Hierarchical k -means.

Add feedback

Weakly Supervised Vulnerability Localization via Multiple Instance Learning

Gu, Wenchao, Chen, Yupan, Wang, Yanlin, Zhang, Hongyu, Gao, Cuiyun, Lyu, Michael R.

arXiv.org Artificial IntelligenceSep-16-2025

Software vulnerability detection has emerged as a significant concern in the field of software security recently, capturing the attention of numerous researchers and developers. Most previous approaches focus on coarse-grained vulnerability detection, such as at the function or file level. However, the developers would still encounter the challenge of manually inspecting a large volume of code inside the vulnerable function to identify the specific vulnerable statements for modification, indicating the importance of vulnerability localization. Training the model for vulnerability localization usually requires ground-truth labels at the statement-level, and labeling vulnerable statements demands expert knowledge, which incurs high costs. Hence, the demand for an approach that eliminates the need for additional labeling at the statement-level is on the rise. To tackle this problem, we propose a novel approach called WAVES for WeAkly supervised Vulnerability Localization via multiplE inStance learning, which does not need the additional statement-level labels during the training. WAVES has the capability to determine whether a function is vulnerable (i.e., vulnerability detection) and pinpoint the vulnerable statements (i.e., vulnerability localization). Specifically, inspired by the concept of multiple instance learning, WAVES converts the ground-truth label at the function-level into pseudo labels for individual statements, eliminating the need for additional statement-level labeling. These pseudo labels are utilized to train the classifiers for the function-level representation vectors. Extensive experimentation on three popular benchmark datasets demonstrates that, in comparison to previous baselines, our approach achieves comparable performance in vulnerability detection and state-of-the-art performance in statement-level vulnerability localization.

data mining, machine learning, vulnerability localization, (18 more...)

arXiv.org Artificial Intelligence

2509.11312

Country:

Europe (1.00)
Asia > China > Guangdong Province (0.28)
North America > United States > Pennsylvania (0.28)
North America > Canada > British Columbia (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

HERCULES: Hierarchical Embedding-based Recursive Clustering Using LLMs for Efficient Summarization

Petnehazi, Gabor, Aradi, Bernadett

arXiv.org Artificial IntelligenceSep-4-2025

The explosive growth of complex datasets across various modalities necessitates advanced analytical tools that not only group data effectively but also provide human-understandable insights into the discovered structures. We introduce HERCULES (Hierarchical Embedding-based Recursive Clustering Using LLMs for Efficient Summarization), a novel algorithm and Python package designed for hierarchical k-means clustering of diverse data types, including text, images, and numeric data (processed one modality per run). HERCULES constructs a cluster hierarchy by recursively applying k-means clustering, starting from individual data points at level 0. A key innovation is its deep integration of Large Language Models (LLMs) to generate semantically rich titles and descriptions for clusters at each level of the hierarchy, significantly enhancing interpretability. The algorithm supports two main representation modes: `direct' mode, which clusters based on original data embeddings or scaled numeric features, and `description' mode, which clusters based on embeddings derived from LLM-generated summaries. Users can provide a `topic\_seed' to guide LLM-generated summaries towards specific themes. An interactive visualization tool facilitates thorough analysis and understanding of the clustering results. We demonstrate HERCULES's capabilities and discuss its potential for extracting meaningful, hierarchical knowledge from complex datasets.

hercules, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2506.19992

Genre: Research Report > Experimental Study (0.67)

Technology: